Self-Driving Car Engineer Nanodegree

Deep Learning

Project: Build a Traffic Sign Recognition Classifier

In this notebook, a template is provided for you to implement your functionality in the stages required to successfully complete this project. If additional code is required that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation for your project. Note that some sections of implementation are optional, and will be marked with 'Optional' in the header.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can be edited by double-clicking the cell to enter edit mode.


Step 0: Load The Data

In [1]:
# Load pickled data
import pickle

# TODO: Fill this in based on where you saved the training and testing data

training_file = './train.p'
testing_file = './test.p'

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

Step 1: Dataset Summary & Exploration

The pickled data is a dictionary with 4 key/value pairs:

  • 'features' is a 4D array containing raw pixel data of the traffic sign images, (num examples, width, height, channels).
  • 'labels' is a 2D array containing the label/class id of the traffic sign. The file signnames.csv contains id -> name mappings for each id.
  • 'sizes' is a list containing tuples, (width, height) representing the original width and height of the image.
  • 'coords' is a list containing tuples, (x1, y1, x2, y2) representing coordinates of a bounding box around the sign in the image. THESE COORDINATES ASSUME THE ORIGINAL IMAGE. THE PICKLED DATA CONTAINS RESIZED VERSIONS (32 by 32) OF THESE IMAGES

Complete the basic data summary below.
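As a quick sanity check before filling in the summary, the dictionary layout described above can be verified directly. A minimal sketch, using a tiny stand-in dict with the same layout (in the notebook, the `train` dict loaded in Step 0 would be checked instead):

```python
import numpy as np

# Stand-in for the dict loaded from train.p; same keys and shapes.
train = {
    'features': np.zeros((10, 32, 32, 3), dtype=np.uint8),  # (num examples, width, height, channels)
    'labels': np.zeros(10, dtype=np.int64),
    'sizes': [(48, 47)] * 10,
    'coords': [(5, 5, 42, 41)] * 10,
}

# All four documented keys are present.
assert {'features', 'labels', 'sizes', 'coords'} <= set(train.keys())
# The pickled images are the resized 32x32 versions.
assert train['features'].shape[1:3] == (32, 32)
# One label per example.
assert len(train['features']) == len(train['labels'])
```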

In [2]:
import pandas as pd
import numpy as np

y_train_df = pd.DataFrame(y_train)
y_train_df.columns=['id']
y_test_df = pd.DataFrame(y_test)
y_test_df.columns=['id']

# found data here https://github.com/domluna/traffic-signs/blob/master/signnames.csv
signnames_file = './signnames.csv'
signnames_df = pd.read_csv(signnames_file)
signnames_df.columns=['id','sign_name']
In [3]:
### Replace each question mark with the appropriate value.

# TODO: Number of training examples
n_train = X_train.shape[0]

# TODO: Number of testing examples.
n_test = X_test.shape[0]

# TODO: What's the shape of a traffic sign image?
image_shape = [X_train.shape[1], X_train.shape[2]]

# TODO: How many unique classes/labels there are in the dataset.

n_classes = len(pd.concat([y_train_df,y_test_df])['id'].unique())

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
Number of training examples = 34799
Number of testing examples = 12630
Image data shape = [32, 32]
Number of classes = 43

Visualize the German Traffic Signs Dataset using the pickled file(s). This is open ended; suggestions include plotting traffic sign images, plotting the count of each sign, etc.

The Matplotlib examples and gallery pages are a great resource for doing visualizations in Python.

NOTE: It's recommended you start with something simple first. If you wish to do more, come back to it after you've completed the rest of the sections.
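A simple starting point is the number of examples per class, which `np.bincount` gives in one call. A sketch with stand-in labels (in the notebook, `y_train` and `n_classes` from the cells above would be used):

```python
import numpy as np

# Stand-in labels; the notebook would pass y_train here.
y_train = np.array([0, 1, 1, 2, 2, 2, 2])
n_classes = 3

# One count per class id; minlength guarantees an entry even for
# classes that happen to have zero examples.
class_counts = np.bincount(y_train, minlength=n_classes)
```

The resulting array can be fed straight into a bar plot to show class imbalance.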

In [4]:
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
import matplotlib.pyplot as plt
# Visualizations will be shown in the notebook.
%matplotlib inline
import cv2
In [5]:
img=X_train[140]
plt.imshow(img)
Out[5]:
<matplotlib.image.AxesImage at 0x7fef9635a4a8>
In [6]:
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) #grayscale conversion
plt.imshow(gray, cmap='gray')
Out[6]:
<matplotlib.image.AxesImage at 0x7fef90a80b38>
In [7]:
hist,bins = np.histogram(img.flatten(),256,[0,256])
cdf = hist.cumsum()
cdf_normalized = cdf * hist.max()/ cdf.max()

plt.plot(cdf_normalized, color = 'b')
plt.hist(img.flatten(),256,[0,256], color = 'r')
plt.xlim([0,256])
plt.legend(('cdf','histogram'), loc = 'upper left')
plt.show()
In [8]:
cdf_m = np.ma.masked_equal(cdf,0)
cdf_m = (cdf_m - cdf_m.min())*255/(cdf_m.max()-cdf_m.min())
cdf = np.ma.filled(cdf_m,0).astype('uint8')
plt.imshow(cdf[img])
Out[8]:
<matplotlib.image.AxesImage at 0x7fef905f9b38>
In [9]:
plt.imshow(cv2.equalizeHist(gray),cmap='gray')
Out[9]:
<matplotlib.image.AxesImage at 0x7fef905cd128>
In [10]:
# capture the return value: cv2.normalize allocates a new float32 output here,
# so assigning an alias (cdf_img = img) and normalizing "in place" would not work
cdf_img = cv2.normalize(img, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
plt.imshow(cdf_img)
Out[10]:
<matplotlib.image.AxesImage at 0x7fef905acf98>
In [11]:
def jitter_image_test(img):
    cols,rows=32,32
    degrees=[-20,-15,-10,-5,0,5,10,15,20]
    Ms=[cv2.getRotationMatrix2D((cols/2,rows/2),d,1) for d in degrees]
    #print(Ms[0])
    positions=np.float32([[[1,0,r[0]],[0,1,r[1]]] for r in np.subtract(np.indices((3,3)).reshape(2,-1).T*2,[2,2])])
    pos_no=len(positions)
    img_no=len(degrees)
    fig = plt.figure(figsize=(pos_no, img_no), dpi=100)
    for p in range(len(positions)):
        for i in range(img_no):
            plt.subplot(len(positions), img_no, (p*img_no)+i+1)
#             plt.imshow(cv2.warpAffine(img,np.add(Ms[i],positions[p]),(cols,rows)),cmap='gray')
            jimg=cv2.warpAffine(img,positions[p],(cols,rows))
            jimg=cv2.warpAffine(jimg,Ms[i],(cols,rows))
            plt.imshow(jimg)
            plt.xticks([]), plt.yticks([])
    plt.show()
jitter_image_test(img)
In [12]:
nosigns=10
imgpersign=30
for i in range(nosigns*imgpersign):
    plt.subplot(nosigns, imgpersign, i+1)
    plt.imshow(X_train[i])
    plt.xticks([]), plt.yticks([])
plt.show()
In [13]:
join_df=y_train_df.join(signnames_df,on=['id'],rsuffix='_sign')
join_df.axes
Out[13]:
[RangeIndex(start=0, stop=34799, step=1),
 Index(['id', 'id_sign', 'sign_name'], dtype='object')]
In [14]:
counts_df= pd.DataFrame(join_df.groupby('sign_name').size().rename('sign_counts'))
counts_df.plot.bar()
Out[14]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fef8326a9e8>
In [15]:
y_train_df.plot.hist(bins=n_classes)
Out[15]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fef830edc88>
In [16]:
y_test_df.plot.hist(bins=n_classes)
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fef832c6748>

Step 2: Design and Test a Model Architecture

Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the German Traffic Sign Dataset.

There are various aspects to consider when thinking about this problem:

  • Neural network architecture
  • Experiment with preprocessing techniques (normalization, RGB to grayscale, etc.)
  • Number of examples per label (some have more than others).
  • Generate fake data.

Here is an example of a published baseline model on this problem. It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.

NOTE: The LeNet-5 implementation shown in the classroom at the end of the CNN lesson is a solid starting point. You'll have to change the number of classes and possibly the preprocessing, but aside from that it's plug and play!
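A common first pass at the preprocessing point above is simple min-max scaling of pixel values into a small fixed range. A NumPy-only sketch of the idea (the cells below use the OpenCV equivalent, `cv2.normalize` with `NORM_MINMAX`):

```python
import numpy as np

def minmax_scale(image, lo=0.1, hi=0.9):
    """Linearly rescale an image's pixel values into [lo, hi]."""
    image = image.astype(np.float32)
    mn, mx = image.min(), image.max()
    return lo + (image - mn) * (hi - lo) / (mx - mn)

# Tiny example image; real inputs are 32x32x3 uint8 arrays.
img = np.array([[0, 128], [255, 64]], dtype=np.uint8)
scaled = minmax_scale(img)
```

Keeping inputs in a small, zero-offset range like this tends to make gradient-based training better conditioned.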

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [17]:
### Preprocess the data here.
### Feel free to use as many code cells as needed.

def normalize(image):
    dest=np.empty(image.shape,dtype=np.float32)
    cv2.normalize(image, dest, alpha=0.1, beta=0.9, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
    return dest
In [18]:
def yuv_normalized(images):
    
    def equalize_and_normalise(image):    
        yuv=cv2.split(cv2.cvtColor(image, cv2.COLOR_RGB2YUV))
        yuv[0]=cv2.equalizeHist(yuv[0])

        norm=normalize(cv2.cvtColor(cv2.merge(yuv), cv2.COLOR_YUV2RGB))
        return norm
        
    converted_images = np.array([equalize_and_normalise(img) for img in images])

    return converted_images

X_train_yuv = yuv_normalized(X_train)
X_test_yuv = yuv_normalized(X_test)
In [19]:
# images equalized and normalized
nosigns=10
imgpersign=30
fig = plt.figure(figsize=(30, 10), dpi=200)
for i in range(nosigns*imgpersign):
    plt.subplot(nosigns, imgpersign, i+1)
    plt.imshow(X_train_yuv[i],)
    plt.xticks([]), plt.yticks([])
plt.show()
In [20]:
# images with no filtering
nosigns=10
imgpersign=30
fig = plt.figure(figsize=(30, 10), dpi=200)
for i in range(nosigns*imgpersign):
    plt.subplot(nosigns, imgpersign, i+1)
    plt.imshow(X_train[i])
    plt.xticks([]), plt.yticks([])
plt.show()
In [21]:
### Generate data additional data (OPTIONAL!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.

import random

X_jittered = []
y_jittered =[]
cols,rows=32,32
     
def append(X,y):
    X_jittered.append(X)
    y_jittered.append(y)

def append_as_is(X,y,id):
    for index in y_train_df[y_train_df.id==id].index.tolist():
        append(X[index],y[index])
            
def jittered_data(X, y, id, count, img_total, rotations_M, positions_M):
    IMG_PER_BATCH = 30
    img_required_count = img_total - count
    
    existing_batches=count // IMG_PER_BATCH
    
    id_index=y_train_df[y_train_df.id==id].index.tolist()
    b_id = 0
    while img_required_count > 0:
        # find a new batch every IMG_PER_BATCH count
        if img_required_count % IMG_PER_BATCH == 0:
            b_id=random.randrange(existing_batches-1)
        b_img=img_required_count % IMG_PER_BATCH
        img_id = id_index[b_id*IMG_PER_BATCH+b_img]
        
        jimg=X[img_id]
        # pick a random matrix number
        r_M=rotations_M[random.randrange(len(rotations_M))]
        jimg=cv2.warpAffine(jimg,r_M,(cols,rows))
        
        # now perturb placement randomly
        p_M=positions_M[random.randrange(len(positions_M))]
        jimg=cv2.warpAffine(jimg,p_M,(cols,rows))
        
        append(jimg,y[img_id])
        img_required_count -= 1
        
def gen_additional_data(X,y):    
    
    # setup the image rotation matrices
    degrees=[-20,-15,-10,-5,5,10,15,20]
    rotations_M=[cv2.getRotationMatrix2D((cols/2,rows/2),d,1) for d in degrees]
    
    positions_M=np.float32([[[1,0,r[0]],[0,1,r[1]]] for r in np.subtract(np.indices((3,3)).reshape(2,-1).T*2,[2,2])])
   
    for id, count in y_train_df.groupby('id').size().rename('sign_counts').iteritems():
        print("id: %d count: %d" %(id, count))

        # append the data as is
        append_as_is(X,y,id)

        # if we don't have enough images, jitter the data to create more
        if count < 1250:
            img_total=count+(85*30)

            jittered_data(X, y, id, count, img_total, rotations_M, positions_M)
        else:
            img_total=count+(55*30)

            jittered_data(X, y, id, count, img_total, rotations_M, positions_M)

# gen_additional_data(X_train,y_train_1hot)
gen_additional_data(X_train_yuv,y_train)
assert len(X_jittered) == len(y_jittered)        
print (len(X_jittered),len(y_jittered))


y_jittered_df = pd.DataFrame(y_jittered)
y_jittered_df.columns=['id']
y_jittered_df.plot.hist(bins=n_classes)
id: 0 count: 180
id: 1 count: 1980
id: 2 count: 2010
id: 3 count: 1260
id: 4 count: 1770
id: 5 count: 1650
id: 6 count: 360
id: 7 count: 1290
id: 8 count: 1260
id: 9 count: 1320
id: 10 count: 1800
id: 11 count: 1170
id: 12 count: 1890
id: 13 count: 1920
id: 14 count: 690
id: 15 count: 540
id: 16 count: 360
id: 17 count: 990
id: 18 count: 1080
id: 19 count: 180
id: 20 count: 300
id: 21 count: 270
id: 22 count: 330
id: 23 count: 450
id: 24 count: 240
id: 25 count: 1350
id: 26 count: 540
id: 27 count: 210
id: 28 count: 480
id: 29 count: 240
id: 30 count: 390
id: 31 count: 690
id: 32 count: 210
id: 33 count: 599
id: 34 count: 360
id: 35 count: 1080
id: 36 count: 330
id: 37 count: 180
id: 38 count: 1860
id: 39 count: 270
id: 40 count: 300
id: 41 count: 210
id: 42 count: 210
132749 132749
Out[21]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fef83a39390>
In [22]:
signnames_df
Out[22]:
id sign_name
0 0 Speed limit (20km/h)
1 1 Speed limit (30km/h)
2 2 Speed limit (50km/h)
3 3 Speed limit (60km/h)
4 4 Speed limit (70km/h)
5 5 Speed limit (80km/h)
6 6 End of speed limit (80km/h)
7 7 Speed limit (100km/h)
8 8 Speed limit (120km/h)
9 9 No passing
10 10 No passing for vehicles over 3.5 metric tons
11 11 Right-of-way at the next intersection
12 12 Priority road
13 13 Yield
14 14 Stop
15 15 No vehicles
16 16 Vehicles over 3.5 metric tons prohibited
17 17 No entry
18 18 General caution
19 19 Dangerous curve to the left
20 20 Dangerous curve to the right
21 21 Double curve
22 22 Bumpy road
23 23 Slippery road
24 24 Road narrows on the right
25 25 Road work
26 26 Traffic signals
27 27 Pedestrians
28 28 Children crossing
29 29 Bicycles crossing
30 30 Beware of ice/snow
31 31 Wild animals crossing
32 32 End of all speed and passing limits
33 33 Turn right ahead
34 34 Turn left ahead
35 35 Ahead only
36 36 Go straight or right
37 37 Go straight or left
38 38 Keep right
39 39 Keep left
40 40 Roundabout mandatory
41 41 End of no passing
42 42 End of no passing by vehicles over 3.5 metric ...
In [23]:
# display a random batch of 10 signs
#jittered_names_df=y_jittered_df.join(signnames_df,on=['id'],rsuffix='_sign')
nosigns=10
imgpersign=30
fig = plt.figure(figsize=(30, 10), dpi=200)
startindex = random.randrange(len(X_jittered)//imgpersign-nosigns)*imgpersign
for i in range(nosigns*imgpersign):
    plt.subplot(nosigns, imgpersign, i+1)
    plt.title(y_jittered_df.iloc[startindex + i].id)
    plt.imshow(X_jittered[startindex + i])
    plt.xticks([]), plt.yticks([])
plt.show()
In [24]:
def batch(inputs, targets, batchsize):
    assert len(inputs) == len(targets)
    for start_idx in range(0, len(inputs), batchsize):
        batch_slice = slice(start_idx, start_idx + batchsize)
        yield inputs[batch_slice], targets[batch_slice]
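A quick usage check of the `batch` generator above, with toy arrays: when the data size is not a multiple of `batchsize`, the final batch is simply shorter.

```python
import numpy as np

def batch(inputs, targets, batchsize):
    assert len(inputs) == len(targets)
    for start_idx in range(0, len(inputs), batchsize):
        batch_slice = slice(start_idx, start_idx + batchsize)
        yield inputs[batch_slice], targets[batch_slice]

X = np.arange(10)
y = np.arange(10) * 2
# Collect the per-batch sizes: 10 examples at batchsize 4 -> 4, 4, 2.
sizes = [len(bx) for bx, _ in batch(X, y, 4)]
```

This ragged last batch is why the evaluation code below weights each batch's accuracy by its actual size before averaging.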
In [25]:
import random
from sklearn.model_selection import train_test_split
X_train_split, X_val_split, y_train_split, y_val_split = train_test_split(X_jittered, y_jittered, test_size=0.10, random_state=74723)

assert len(X_train_split) == len(y_train_split)
assert len(X_val_split) == len(y_val_split)

print("Number of training examples =", len(X_train_split))
print("Number of validation examples =", len(X_val_split))
print("Number of testing examples =", len(X_test))
Number of training examples = 119474
Number of validation examples = 13275
Number of testing examples = 12630
In [26]:
### Define your architecture here.
### Feel free to use as many code cells as needed.
In [27]:
import tensorflow as tf
from tensorflow.contrib.layers import flatten
import os
print(tf.__version__)
0.12.1
In [28]:
# using https://www.tensorflow.org/how_tos/summaries_and_tensorboard/#launching_tensorboard
def variable_summaries(var):
    """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)

def nn_layer(input_tensor, W, b, layer_name, act=tf.nn.relu):
    """Reusable code for making a simple neural net layer.

    It does a matrix multiply, bias add, and then uses relu to nonlinearize.
    It also sets up name scoping so that the resultant graph is easy to read,
    and adds a number of summary ops.
    """
    # Adding a name scope ensures logical grouping of the layers in the graph.
    with tf.name_scope(layer_name):
        # This Variable will hold the state of the weights for the layer
        with tf.name_scope('weights'):
            variable_summaries(W)
        with tf.name_scope('biases'):
            variable_summaries(b)
        with tf.name_scope('Wx_plus_b'):
            preactivate = tf.matmul(input_tensor, W) + b
            tf.summary.histogram('pre_activations', preactivate)
        activations = act(preactivate, name='activation')
        tf.summary.histogram('activations', activations)
        return activations
In [29]:
def conv2d(x, W, b, layer_name='conv1', strides=1):
    # Conv2D wrapper, with bias and relu activation
    with tf.name_scope(layer_name):
        tf.summary.scalar('strides', strides)
        with tf.name_scope('weights'):
            variable_summaries(W)
        with tf.name_scope('biases'):
            variable_summaries(b)
        with tf.name_scope('conv2d_plus_b'):
            conv = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='VALID') + b
            tf.summary.histogram('pre_activations',conv)
        x = tf.nn.relu(conv)
        tf.summary.histogram('activations', x)
        return x
    
def maxpool2d(x, layer_name='maxpool1', k=2):
    with tf.name_scope(layer_name):
        tf.summary.scalar('k', k)
        with tf.name_scope('max_pool'):
            x = tf.nn.max_pool(
                x,
                ksize=[1, k, k, 1],
                strides=[1, k, k, 1],
                padding='VALID')
            tf.summary.histogram('post_max_pool', x)
            return x

def conv_net(x, weights, biases, keep_prob):

    print("shape ", x)
    with tf.name_scope('input'):
        tf.summary.image('input', x, 100) 
       
    # Convolution Layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'], 'conv1', strides=1)
    print("conv1 shape ", conv1)
    
    # Max Pooling (down-sampling)
    conv1 = maxpool2d(conv1, 'maxpool1', k=2)
    print("conv1 after maxpool shape ", conv1)
    
    # Convolution Layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'], 'conv2', strides=1)
    print("conv2 shape ", conv2)
        
    # Max Pooling (down-sampling)
    conv2 = maxpool2d(conv2, 'maxpool2', k=2)
    print("conv2 after maxpool shape ", conv2)
    
    
    # Reshape conv1 output to fit fully connected layer input
    with tf.name_scope('flatten'):
        flat1 = flatten(conv2) 
    print("flat1 shape", flat1)
    
    # Fully connected layer
    fc1 = nn_layer(flat1, weights['wd1'], biases['bd1'],'fc1')
    print("fc1 shape", fc1)
    
    # Fully connected layer
    fc2 = nn_layer(fc1, weights['wd2'], biases['bd2'],'fc2')
    print("fc2 shape", fc2)
    
    # Apply Dropout
    with tf.name_scope('dropout'):    
        tf.summary.scalar('dropout_keep_probability', keep_prob)
        dropped = tf.nn.dropout(fc2, keep_prob)

    # Output, class prediction
    out = nn_layer(dropped, weights['out'], biases['out'], 'out', act=tf.identity)
    
    print("out shape ", out)
    return out
In [30]:
# Hyperparameters
mu = 0
sigma = 0.1

def weight_variable(shape):
    w = tf.truncated_normal(shape, mean = mu, stddev = sigma)
    return tf.Variable(w, name='weights')

def bias_variable(shape):
    b = tf.zeros(shape=shape)
    return tf.Variable(b, name='biases')

# Store layers weight & bias
weights = {
    'wc1': weight_variable([5, 5, 3, 12]),
    'wc2': weight_variable([5, 5, 12, 32]),
    'wd1': weight_variable([5*5*32, 240]),
    'wd2': weight_variable([240, 168]),
    'out': weight_variable([168, n_classes])
}

biases = {
    'bc1': bias_variable([12]),
    'bc2': bias_variable([32]),
    'bd1': bias_variable([240]),
    'bd2': bias_variable([168]),
    'out': bias_variable([n_classes])
}

with tf.name_scope('input'):
    # x consists of 32x32x3 color images
    x = tf.placeholder(tf.float32, (None,32,32,3), name='x-input')
    # Classify over 43 unique label classes
    y = tf.placeholder(tf.float32, (None, n_classes), name='y-input')

keep_prob = tf.placeholder(tf.float32, name='keep-prob') #dropout (keep probability)

# Construct the model
logits = conv_net(x, weights, biases, keep_prob)

   
with tf.name_scope('cross_entropy'):
    # Define loss and optimizer
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, y))
    #cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(prediction + 1e-6), reduction_indices=1))
    with tf.name_scope('total'):
        cross_entropy = tf.reduce_mean(cross_entropy)
tf.summary.scalar('cross_entropy', cross_entropy)

learning_rate = 0.001

with tf.name_scope('train'):
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)
    
with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
        #correct_prediction = tf.equal(tf.argmax(out, 1), tf.argmax(y, 1))
        correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
    with tf.name_scope('accuracy'):    
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
tf.summary.scalar('accuracy', accuracy)
shape  Tensor("input/x-input:0", shape=(?, 32, 32, 3), dtype=float32)
conv1 shape  Tensor("conv1/Relu:0", shape=(?, 28, 28, 12), dtype=float32)
conv1 after maxpool shape  Tensor("maxpool1/max_pool/MaxPool:0", shape=(?, 14, 14, 12), dtype=float32)
conv2 shape  Tensor("conv2/Relu:0", shape=(?, 10, 10, 32), dtype=float32)
conv2 after maxpool shape  Tensor("maxpool2/max_pool/MaxPool:0", shape=(?, 5, 5, 32), dtype=float32)
flat1 shape Tensor("flatten/Flatten/Reshape:0", shape=(?, 800), dtype=float32)
fc1 shape Tensor("fc1/activation:0", shape=(?, 240), dtype=float32)
fc2 shape Tensor("fc2/activation:0", shape=(?, 168), dtype=float32)
out shape  Tensor("out/activation:0", shape=(?, 43), dtype=float32)
Out[30]:
<tf.Tensor 'accuracy_1:0' shape=() dtype=string>
In [31]:
def eval_data(X_data, y_data, writer, step):
    """
    Given X_data, y_data as input returns the loss and accuracy.
    """

    num_examples = len(X_data)

    total_acc, total_loss = 0., 0.
    batch_size=[]
    for batch_x, batch_y in batch(X_data, y_data, BATCH_SIZE):
        size=len(batch_x)
        batch_size.append(size)
        summary, acc = sess.run([merged, accuracy], feed_dict={x: batch_x, y: batch_y, keep_prob: 1.})
        writer.add_summary(summary, step)

        total_acc += (acc * size)

    return total_acc/num_examples

EPOCHS = 30
BATCH_SIZE = 1000
dropout_keep_prob = 0.25 # Dropout, probability to keep units

saver = tf.train.Saver()

y_train_1hot = tf.one_hot(y_train_split, n_classes).eval(session=tf.Session())
y_val_1hot = tf.one_hot(y_val_split, n_classes).eval(session=tf.Session())
y_test_1hot = tf.one_hot(y_test, n_classes).eval(session=tf.Session())


log_dir='./logs' 
if tf.gfile.Exists(log_dir):
    tf.gfile.DeleteRecursively(log_dir)
tf.gfile.MakeDirs(log_dir)

with tf.Session() as sess:
    merged = tf.summary.merge_all()
    train_writer = tf.summary.FileWriter(log_dir + '/train', sess.graph)
    val_writer = tf.summary.FileWriter(log_dir + '/validation')
    test_writer = tf.summary.FileWriter(log_dir + '/test')

    sess.run(tf.global_variables_initializer())

    val_accuracies=[]
    step = 0
    # Train model
    for i in range(EPOCHS):            
        for batch_x, batch_y in batch(X_train_split, y_train_1hot, BATCH_SIZE):
            step += 1
            
            if step % 1000 == 0: # record a summary with trace
                run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
                run_metadata = tf.RunMetadata()
                summary, _ = sess.run([merged,  optimizer],
                                      feed_dict={x: batch_x, y: batch_y, keep_prob: dropout_keep_prob},
                                      options=run_options,
                                      run_metadata=run_metadata)
                train_writer.add_run_metadata(run_metadata, 'epoch%03d-step%04d' % (i,step))
                train_writer.add_summary(summary, step)
                print('Adding run metadata for ', step)
            elif step % 100 == 1: # record a summary
                summary, _ = sess.run([merged, optimizer], feed_dict={x: batch_x, y: batch_y, keep_prob: dropout_keep_prob})
                train_writer.add_summary(summary, step)
            else:
                _ = sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: dropout_keep_prob})  
        
        val_acc = eval_data(X_val_split, y_val_1hot, val_writer, step)
        val_accuracies.append(val_acc)
        print("EPOCH {:3d} Validation accuracy = {:.3f}".format(i+1, val_acc))
        
        # save a model checkpoint for tensorboard
        saver.save(sess, log_dir+'/model.ckpt', step)
        
        if i+1 > 10 and val_acc < 0.10:
            print("Oh not so good ... exit")
            break

    # Evaluate on the test data
    test_acc = eval_data(X_test_yuv, y_test_1hot, test_writer, step)
    print("Test accuracy = {:.3f}".format(test_acc))

    saver.save(sess, os.getcwd() + "/cnn-run")

train_writer.close()
val_writer.close()
test_writer.close()
EPOCH   1 Validation accuracy = 0.494
EPOCH   2 Validation accuracy = 0.635
EPOCH   3 Validation accuracy = 0.736
EPOCH   4 Validation accuracy = 0.811
EPOCH   5 Validation accuracy = 0.853
EPOCH   6 Validation accuracy = 0.878
EPOCH   7 Validation accuracy = 0.893
EPOCH   8 Validation accuracy = 0.908
Adding run metadata for  1000
EPOCH   9 Validation accuracy = 0.925
EPOCH  10 Validation accuracy = 0.925
EPOCH  11 Validation accuracy = 0.942
EPOCH  12 Validation accuracy = 0.950
EPOCH  13 Validation accuracy = 0.951
EPOCH  14 Validation accuracy = 0.954
EPOCH  15 Validation accuracy = 0.960
EPOCH  16 Validation accuracy = 0.960
Adding run metadata for  2000
EPOCH  17 Validation accuracy = 0.965
EPOCH  18 Validation accuracy = 0.964
EPOCH  19 Validation accuracy = 0.967
EPOCH  20 Validation accuracy = 0.973
EPOCH  21 Validation accuracy = 0.974
EPOCH  22 Validation accuracy = 0.974
EPOCH  23 Validation accuracy = 0.977
EPOCH  24 Validation accuracy = 0.975
Adding run metadata for  3000
EPOCH  25 Validation accuracy = 0.976
EPOCH  26 Validation accuracy = 0.976
EPOCH  27 Validation accuracy = 0.981
EPOCH  28 Validation accuracy = 0.982
EPOCH  29 Validation accuracy = 0.980
EPOCH  30 Validation accuracy = 0.980
Test accuracy = 0.925
In [32]:
# plot the validation losses and accuracies for each epoch
fig, ax1 = plt.subplots()
e = np.arange(0, EPOCHS, 1)
ax1.set_xlabel('Epochs')
ax1.plot(e, np.multiply(val_accuracies,100),'b-')
ax1.set_ylabel('Accuracy', color='b')
for tl in ax1.get_yticklabels():
    tl.set_color('b')
plt.show()

Step 3: Test a Model on New Images

Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.

You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.
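For the id-to-name lookup, a plain dict built from signnames.csv is convenient. A sketch with a two-row stand-in DataFrame (the real file at ./signnames.csv has all 43 rows and is read with `pd.read_csv`):

```python
import pandas as pd

# Stand-in for pd.read_csv('./signnames.csv'); same two columns.
signnames_df = pd.DataFrame({'id': [13, 14], 'sign_name': ['Yield', 'Stop']})

# Map class id -> human-readable sign name.
id_to_name = dict(zip(signnames_df['id'], signnames_df['sign_name']))
```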

Implementation

Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.

In [33]:
### Load the images and plot them here.
### Feel free to use as many code cells as needed.
In [34]:
X_predict=[]
y_predict=[]
for i in range(20):
    r = random.randrange(len(X_test_yuv))
    X_predict.append(X_test_yuv[r])
    y_predict.append(y_test[r])

predict_count=len(X_predict)
y_predict_df = pd.DataFrame(y_predict)
y_predict_df.columns=['id']
In [35]:
fig = plt.figure(figsize=(20, predict_count), dpi=200)

for i in range(predict_count):
    plt.subplot(1, predict_count, i+1)
    plt.title(y_predict_df.iloc[i].id)
    plt.imshow(X_predict[i])
    plt.xticks([]), plt.yticks([])
plt.show()
In [36]:
print("Number of predictions: %d" % (len(X_predict)))

with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, os.getcwd() + "/cnn-run")
    print("Model restored.")
   
    ps=sess.run(tf.argmax(logits,1),feed_dict={x:X_predict, keep_prob: 1.0})
    print ([(p,signnames_df.iloc[p].sign_name) for p in ps])
Number of predictions: 20
Model restored.
[(29, 'Bicycles crossing'), (13, 'Yield'), (31, 'Wild animals crossing'), (25, 'Road work'), (16, 'Vehicles over 3.5 metric tons prohibited'), (7, 'Speed limit (100km/h)'), (36, 'Go straight or right'), (3, 'Speed limit (60km/h)'), (27, 'Pedestrians'), (3, 'Speed limit (60km/h)'), (42, 'End of no passing by vehicles over 3.5 metric tons'), (13, 'Yield'), (18, 'General caution'), (37, 'Go straight or left'), (17, 'No entry'), (7, 'Speed limit (100km/h)'), (19, 'Dangerous curve to the left'), (35, 'Ahead only'), (5, 'Speed limit (80km/h)'), (16, 'Vehicles over 3.5 metric tons prohibited')]
In [37]:
#from os import scandir
from os.path import isfile, join
road_signs_path='./signs/'
def image_file(file):
    return file.endswith(('.jpg', '.png'))

def load_image(path):
    img = cv2.cvtColor(cv2.imread(path, cv2.IMREAD_COLOR), cv2.COLOR_BGR2RGB)
    return cv2.resize(img,(32,32), interpolation = cv2.INTER_AREA)
    
test_images=[load_image('./signs/' + f) for f in os.listdir('./signs/') if image_file(f)]


test_images=yuv_normalized(test_images)
test_image_count=len(test_images)

fig = plt.figure(figsize=(20, test_image_count), dpi=200)
for i in range(test_image_count):
    plt.subplot(1, test_image_count, i+1)
    plt.imshow(test_images[i])
    plt.xticks([]), plt.yticks([])
plt.show()

print("Number of predictions: %d" % (len(test_images)))

with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, os.getcwd() + "/cnn-run")
    print("Model restored.")
   
    ps=sess.run(tf.argmax(logits,1),feed_dict={x:test_images, keep_prob: 1.0})
    print ([(p,signnames_df.iloc[p].sign_name) for p in ps])
Number of predictions: 5
Model restored.
[(26, 'Traffic signals'), (11, 'Right-of-way at the next intersection'), (8, 'Speed limit (120km/h)'), (13, 'Yield'), (9, 'No passing')]
In [38]:
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.
In [39]:
def softmax_image_explore(timage):
    with tf.Session() as sess:
        # Restore variables from disk.
        saver.restore(sess, os.getcwd() + "/cnn-run")
        print("Model restored.")
    
        image_logits=sess.run(logits,feed_dict={x:[timage], keep_prob: 1.0})
        print(image_logits)
            
        softmax=sess.run(tf.nn.softmax(image_logits))
        print(softmax)

        # experimented with softmax calculations 
        # per https://carnd-forums.udacity.com/questions/12619143/one-reason-for-low-accuracy-ill-conditioned-value-for-log-calculation
        # eventually realised I'd not normalised my captured test images 
        # - hence it was giving me large logit values
        image_logits -= np.max(image_logits)
        print(image_logits)

        predictions=sess.run(tf.nn.softmax(image_logits))
        print(predictions)

        probs = np.exp(image_logits) / np.sum(np.exp(image_logits))
        print(probs)
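The logit-shifting trick explored above is the standard numerically stable softmax: subtracting the maximum logit changes nothing mathematically, because the constant cancels in the ratio, but it keeps `exp()` from overflowing on large logits. A NumPy sketch:

```python
import numpy as np

def stable_softmax(logits):
    """Softmax with the max logit subtracted first, so exp() never overflows."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Naive exp(1000) would overflow to inf; the shifted version is fine.
logits = np.array([1000.0, 1001.0, 1002.0])
probs = stable_softmax(logits)
```

With un-normalized input images the raw logits can be large enough for the naive formula to produce inf/NaN, which is exactly the failure mode noted in the comments above.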
In [40]:
softmax_image_explore(X_test_yuv[1000])
Model restored.
[[-110.68930817  -69.89345551  -85.80612946  -73.24155426  -59.94990158
   -41.8735466   -89.2749176   -64.95406342  -79.65623474  -35.46466064
   -42.16519928  -60.78334045  -20.25019455  -12.30007458   27.85941887
   -25.38098907  -36.40685654   86.88309479  -50.01707458 -107.60930634
   -48.60890198  -76.29397583  -29.65627098  -95.69667816 -148.64001465
   -47.12564468  -28.96710014 -132.22109985  -80.96204376  -59.69469833
   -82.26026154  -84.83629608  -37.64166641  -37.1951561   -11.77512836
   -50.80939102  -55.45547104  -49.51015854  -23.50367165  -34.62843323
   -60.48906708  -44.69578171  -83.64563751]]
[[  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   2.32457772e-26   0.00000000e+00
    0.00000000e+00   1.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00]]
[[-197.57240295 -156.77655029 -172.68922424 -160.12464905 -146.83299255
  -128.75663757 -176.15802002 -151.8371582  -166.53933716 -122.34775543
  -129.04829407 -147.66644287 -107.13328552  -99.1831665   -59.02367401
  -112.26408386 -123.28994751    0.         -136.900177   -194.49240112
  -135.49200439 -163.17706299 -116.53936768 -182.57977295 -235.52310181
  -134.00874329 -115.85019684 -219.10418701 -167.84513855 -146.57778931
  -169.14335632 -171.71939087 -124.52476501 -124.07824707  -98.65822601
  -137.69248962 -142.33856201 -136.39324951 -110.38676453 -121.51152802
  -147.37216187 -131.57887268 -170.5287323 ]]
[[  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   2.32457772e-26   0.00000000e+00
    0.00000000e+00   1.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00]]
[[  0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   8.40779079e-44   2.32457772e-26   0.00000000e+00
    0.00000000e+00   1.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   1.42932443e-43   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00   0.00000000e+00
    0.00000000e+00   0.00000000e+00   0.00000000e+00]]
In [41]:
softmax_image_explore(test_images[0])
Model restored.
[[-37.65011597 -22.31915474 -25.20582771 -31.47193909 -10.56943607
  -11.19386101 -19.24780846 -24.08270454 -18.12680626 -31.67955017
  -25.52492332 -15.00748539 -12.91357613  -6.90821505 -20.59121513
  -12.25382519 -34.44514847  -9.52122021   8.37997532 -14.89797974
    5.36537552  -4.19267941   5.53355694 -26.41065025 -17.29397964
    5.65445185  11.35988617 -16.81739235 -14.05293751 -13.61821747
  -14.10123444  -5.67723131 -15.57600784 -14.53693771 -21.94241905
  -23.26902771 -19.64559937 -18.4302063  -15.62522411  -7.46717691
  -20.35674477 -23.8143158  -26.11409569]]
[[  4.89889539e-22   2.22972297e-15   1.24332525e-16   2.36181727e-19
    2.82545543e-10   1.51322579e-10   4.80970888e-14   3.82253025e-16
    1.47558371e-13   1.91903097e-19   9.03652798e-17   3.33939755e-12
    2.71044714e-11   1.09935092e-08   1.25511834e-14   5.24284123e-11
    1.20780801e-20   8.05975731e-10   4.79416326e-02   3.72585036e-12
    2.35227868e-03   1.66141561e-07   2.78310152e-03   3.72679310e-17
    3.39356256e-13   3.14074755e-03   9.43781972e-01   5.46556561e-13
    8.67407101e-12   1.33973561e-11   8.26508827e-12   3.76483342e-08
    1.89130504e-12   5.34593854e-12   3.24985112e-15   8.62433171e-16
    3.23117421e-14   1.08942891e-13   1.80047362e-12   6.28611430e-09
    1.58676969e-14   4.99930757e-16   5.01334185e-17]]
[[-49.01000214 -33.679039   -36.56571198 -42.83182526 -21.92932129
  -22.55374718 -30.60769463 -35.44258881 -29.48669243 -43.03943634
  -36.8848114  -26.36737061 -24.2734623  -18.26810074 -31.9511013
  -23.61371231 -45.80503464 -20.88110733  -2.97991085 -26.25786591
   -5.99451065 -15.55256557  -5.82632923 -37.77053833 -28.65386581
   -5.70543432   0.         -28.17727852 -25.41282272 -24.97810364
  -25.46112061 -17.037117   -26.93589401 -25.89682388 -33.30230713
  -34.62891388 -31.00548553 -29.79009247 -26.98511124 -18.82706261
  -31.71663094 -35.17420197 -37.47398376]]
[[  4.89889539e-22   2.22972297e-15   1.24332525e-16   2.36181727e-19
    2.82545543e-10   1.51322579e-10   4.80970888e-14   3.82253025e-16
    1.47558371e-13   1.91903097e-19   9.03652798e-17   3.33939755e-12
    2.71044714e-11   1.09935092e-08   1.25511834e-14   5.24284123e-11
    1.20780801e-20   8.05975731e-10   4.79416326e-02   3.72585036e-12
    2.35227868e-03   1.66141561e-07   2.78310152e-03   3.72679310e-17
    3.39356256e-13   3.14074755e-03   9.43781972e-01   5.46556561e-13
    8.67407101e-12   1.33973561e-11   8.26508827e-12   3.76483342e-08
    1.89130504e-12   5.34593854e-12   3.24985112e-15   8.62433171e-16
    3.23117421e-14   1.08942891e-13   1.80047362e-12   6.28611430e-09
    1.58676969e-14   4.99930757e-16   5.01334185e-17]]
[[  4.89889539e-22   2.22972297e-15   1.24332525e-16   2.36181727e-19
    2.82545543e-10   1.51322579e-10   4.80970888e-14   3.82253025e-16
    1.47558371e-13   1.91903097e-19   9.03652865e-17   3.33939755e-12
    2.71044732e-11   1.09935092e-08   1.25511843e-14   5.24284158e-11
    1.20780801e-20   8.05975786e-10   4.79416363e-02   3.72585036e-12
    2.35227891e-03   1.66141561e-07   2.78310152e-03   3.72679343e-17
    3.39356256e-13   3.14074755e-03   9.43781972e-01   5.46556561e-13
    8.67407101e-12   1.33973561e-11   8.26508827e-12   3.76483342e-08
    1.89130504e-12   5.34593854e-12   3.24985112e-15   8.62433171e-16
    3.23117421e-14   1.08942898e-13   1.80047372e-12   6.28611474e-09
    1.58676969e-14   4.99930757e-16   5.01334185e-17]]
In [42]:
from pylab import *
from matplotlib import gridspec
    
def topk_plot(test_images):
    """Plot each test image next to a horizontal bar chart of its
    top-5 softmax probabilities. Assumes exactly 5 input images
    (the reshape(25) below is 5 images x k=5)."""
    with tf.Session() as sess:
        # Restore variables from disk.
        saver.restore(sess, os.getcwd() + "/cnn-run")
        print("Model restored.")
        softmax = tf.nn.softmax(logits)

        probs = sess.run(softmax, feed_dict={x: test_images, keep_prob: 1.0})

        (values, indices) = sess.run(tf.nn.top_k(probs, k=5))
        print("top_k values, indices")
        print(values, indices)

    # Pair each top-5 class index with its human-readable sign name.
    indices_names = np.array([signnames_df.iloc[i].sign_name for i in indices.reshape(25)])
    image_values = np.array([p for p in zip(indices_names, values.reshape(25))])

    image_values = image_values.reshape(5, 5, 2)

    pos = arange(5) + .5    # the bar centers on the y axis

    fig = plt.figure(figsize=(10, 6))
    gs = gridspec.GridSpec(len(test_images), 2, width_ratios=[1, 5])

    for i in range(len(image_values)):
        values = [float(v[1]) for v in image_values[i]]
        labels = [v[0] for v in image_values[i]]

        # Image on the left, its top-5 probability bars on the right.
        plt.subplot(gs[i*2])
        plt.imshow(test_images[i])
        plt.xticks([]), plt.yticks([])
        plt.subplot(gs[i*2+1])
        plt.tight_layout()
        barh(-pos, values, align='center')
        yticks(-pos, labels)
        xlabel('softmax probabilities')

    show()
In [43]:
# plot top_k of 5 test images - for reference
topk_plot(X_test_yuv[100:105])
Model restored.
top_k values, indices
[[  9.99999523e-01   3.38545988e-07   4.60469280e-08   3.56176173e-08
    1.04962392e-13]
 [  1.00000000e+00   5.77366528e-08   4.65995992e-12   1.12003343e-13
    1.74873340e-14]
 [  9.80123878e-01   1.94353089e-02   2.70967663e-04   1.69142630e-04
    6.14110718e-07]
 [  9.99999881e-01   1.56198183e-07   1.45980819e-08   4.27757962e-09
    1.82141946e-09]
 [  1.00000000e+00   2.16945089e-17   4.38671539e-22   2.79181942e-26
    7.97545939e-32]] [[ 1  4  0  2  7]
 [10  5 23  9 42]
 [ 5  3  7  2  8]
 [11 21 20 30 10]
 [33 35 39 36 40]]
In [44]:
#plot top k of captured images
topk_plot(test_images)
Model restored.
top_k values, indices
[[  9.43782330e-01   4.79413308e-02   3.14076222e-03   2.78310524e-03
    2.35228194e-03]
 [  9.00111139e-01   9.88779962e-02   5.81709610e-04   4.16040042e-04
    1.30011831e-05]
 [  5.22074103e-01   4.03456092e-01   4.61749509e-02   1.96869411e-02
    4.24932456e-03]
 [  1.00000000e+00   7.09952444e-11   1.96239171e-11   2.19223012e-12
    1.45592360e-12]
 [  5.84236920e-01   4.15534019e-01   1.92516600e-04   2.67877258e-05
    6.24982749e-06]] [[26 18 25 22 20]
 [11  1  4  5 18]
 [ 8  9 14  5  7]
 [13  9  3  4 14]
 [ 9 13 23 20 10]]
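The `tf.nn.top_k` call used above can be cross-checked with a plain NumPy equivalent. This is a sketch for verification only, not part of the project code; it returns the same `(values, indices)` shapes for a batch of probability rows:

```python
import numpy as np

def top_k(probs, k=5):
    # Mirror tf.nn.top_k on a 2-D batch: for each row, the indices
    # of the k largest entries in descending order, plus their values.
    indices = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    values = np.take_along_axis(probs, indices, axis=1)
    return values, indices

values, indices = top_k(np.array([[0.1, 0.6, 0.3]]), k=2)
print(values, indices)
```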

Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the iPython Notebook as an HTML document. You can do this by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.